Neyman-Pearson Detection of Gauss-Markov Signals in Noise:
Closed-Form Error Exponent and Properties 



Youngchul Sung, Lang Tong, and H. Vincent Poor

Abstract 

The performance of Neyman-Pearson detection of correlated random signals using noisy observations 
is considered. Using the large deviations principle, the performance is analyzed via the error exponent 
for the miss probability with a fixed false-alarm probability. Using the state-space structure of the signal 
and observation model, a closed-form expression for the error exponent is derived using the innovations 
approach, and the connection between the asymptotic behavior of the optimal detector and that of the 
Kalman filter is established. The properties of the error exponent are investigated for the scalar case. 
It is shown that the error exponent has distinct characteristics with respect to correlation strength: for 
signal-to-noise ratio (SNR) > 1, the error exponent is monotonically decreasing as the correlation becomes 
strong while for SNR < 1 there is an optimal correlation that maximizes the error exponent for a given 
SNR. 

Index Terms — Error exponent, Neyman-Pearson detection, Correlated signal, Gauss-Markov model, 
Autoregressive process. 
I. Introduction

In this paper, we consider the detection of correlated random signals using noisy observations
$y_i$ under the Neyman-Pearson formulation. The null and alternative hypotheses are given by

$$
\begin{aligned}
H_0 &: y_i = w_i, \qquad i = 1, 2, \cdots, n, \\
H_1 &: y_i = s_i + w_i,
\end{aligned}
\tag{1}
$$

where $\{w_i\}$ is independent and identically distributed (i.i.d.) $\mathcal{N}(0, \sigma^2)$ noise with a known
variance $\sigma^2$, and $\{s_i\}$ is the stochastic signal process correlated in time. We assume that $\{s_i\}$ is
a Gauss-Markov process following a state-space model. An example of an application in which 
this type of problem arises is the detection of stochastic signals in large sensor networks, where it 
is reasonable to assume that signal samples taken at closely spaced locations are correlated, while 
the measurement noise is independent from sensor to sensor. In this paper, we are interested in 
the performance of the Neyman-Pearson detector for the hypotheses (1) with a fixed level (i.e., 
upper-bound constraint on the false-alarm probability) when the sample size n is large. 

In many cases, the miss probability $P_M$ of the Neyman-Pearson detector with a fixed level
decays exponentially as the sample size increases, and the error exponent is defined as the rate
of exponential decay, i.e.,

$$
K = \lim_{n \to \infty} -\frac{1}{n} \log P_M \tag{2}
$$

under the given false-alarm constraint. The error exponent is an important parameter since it 
gives an estimate of the number of samples required for a given detector performance; faster 
decay rate implies that fewer samples are needed for a given miss probability, or that better 
performance can be obtained with a given number of samples. Hence, the error exponent is a 
good performance index for detectors in the large sample regime. For the case of i.i.d. samples 
where each sample is drawn independently from the common null probability density $p_0$ or
alternative density $p_1$, the error exponent under the fixed false-alarm constraint is given by the
Kullback-Leibler information $D(p_0 \| p_1)$ between the two densities $p_0$ and $p_1$ (C. Stein [29]). For
more general cases, the error exponent is given by the asymptotic Kullback-Leibler rate defined
as the almost-sure limit of

$$
\frac{1}{n} \log \frac{p_{0,n}}{p_{1,n}}(y_1, \cdots, y_n) \quad \text{as } n \to \infty, \tag{3}
$$

under $p_{0,n}$, where $p_{0,n}$ and $p_{1,n}$ are the null and alternative joint densities of $y_1, \cdots, y_n$, respectively,
assuming that the limit exists* [30-34]. However, the closed-form calculation of (3) is
available only for restricted cases. One such example is the discrimination between two autore- 
gressive (AR) signals with distinct parameters under the two hypotheses [34,35]. In this case, 
the joint density $p_{j,n}$ is easily decomposed using the Markov property under each hypothesis,
and the calculation of the rate is straightforward. However, for the problem of (1) this approach 
is not available since the observation samples under the alternative hypothesis do not possess 
the Markov property due to the additive noise, even if the signal itself is Markovian; i.e., the 
alternative is a hidden Markov model. 

A. Summary of Results

Our approach to this problem is to exploit the state-space model. The state-space approach in 
detection is well established in calculation of the log-likelihood ratio (LLR) for correlated signals 
[6,7]. With the state-space model, the LLR is expressed through the innovations representation 
[11] and the innovations are easily obtained by the Kalman filter. The key idea for the closed- 
form calculation of the error exponent for the hidden Markov model is based on the properties 
of innovations. Since the innovations process is independent from time to time, the joint density 
under $H_1$ is given by the product of marginal densities of the innovations, and the LLR is given by
a function of the sum of squares of the innovations; this functional form facilitates the closed-form
calculation of (3). 

By applying this state-space approach, we derive a closed-form expression for the error ex- 
ponent K for the miss probability of the Neyman-Pearson detector for (1) of fixed false-alarm 
probability, $P_F = \alpha$.

We next investigate the properties of the error exponent using the obtained closed-form ex- 
pression. We explore the asymptotic relationship between the innovations approach and the 
spectrum of the observation. We show that the error exponent K is a function of the signal-to- 
noise ratio (SNR) and the correlation, and has different behavior with respect to (w.r.t.) the 
correlation strength depending on the SNR. We show a sharp phase transition at SNR = 1: at 
high SNR, K is monotonically decreasing as a function of the correlation, while at low SNR, on 
the other hand, there exists an optimal correlation value that yields the maximal K. 

We also make a connection between the asymptotic behavior of the Kalman filter and that
of the Neyman-Pearson detector. It is shown that the error exponent is determined by the
asymptotic (or steady-state) variances of the innovations under $H_0$ and $H_1$ together with the
noise variance.

*Ergodic cases are examples for which this limit exists.

B. Related Works 

The detection of Gauss-Markov processes in Gaussian noise is a classical problem. See [5] 
and references therein. Our work focuses on the performance analysis as measured by the error 
exponent, and relies on the connection between the likelihood ratio and the innovations process 
as described by Schweppe [6]. In addition to the calculation of the LLR, the state-space approach 
has also been used in the performance analysis in this detection problem. Exploiting the state- 
space model, Schweppe obtained a differential equation for the Bhattacharyya distance between 
two Gaussian processes [8-10], which gives an upper bound on the average error probability 
under a Bayesian formulation. 

There is an extensive literature on the large deviations approach to the analysis of the detection 
of Gauss-Markov processes [19-28]. Many of these results rely on the extension of Cramér's theorem
by Gärtner and Ellis [15-18] and the properties of the asymptotic eigenvalue distributions
of Toeplitz matrices [12,13]. To find the rate function, however, this approach usually requires 
an optimization that requires nontrivial numerical methods except in some simple cases, and the 
rate is given as an integral of the spectrum of the observation process; closed-form expressions 
are difficult to obtain except for the case of a noiseless autoregressive (AR) process in discrete- 
time and its continuous-time counterpart, the Ornstein-Uhlenbeck process [19-27]. In addition, 
most results have been obtained for a fixed threshold for the normalized LLR test, which results 
in expressions for the rate as a function of the threshold. For ergodic cases, however, the nor- 
malized LLR converges to a constant under the null hypothesis and the false alarm probability 
also decays exponentially for a fixed threshold. Hence, a detector with a fixed threshold is not 
optimal in the Neyman-Pearson sense since it does not use the false-alarm constraint fully; i.e.,
the optimal threshold is a function of sample size. 

C. Notation and Organization 

We will make use of standard notational conventions. Vectors and matrices are written in 
boldface with matrices in capitals. All vectors are column vectors. For a scalar z, z* denotes 
the complex conjugate. For a matrix $\mathbf{A}$, $\mathbf{A}^T$ and $\mathbf{A}^H$ indicate the transpose and Hermitian
transpose, respectively. $\det(\mathbf{A})$ and $\mathrm{tr}(\mathbf{A})$ denote the determinant and trace of $\mathbf{A}$, respectively.
$\mathbf{A}(l, m)$ denotes the element of the $l$th row and $m$th column, and $\{\lambda_k(\mathbf{A})\}$ denotes the set of all
eigenvalues of $\mathbf{A}$. We reserve $\mathbf{I}_m$ for the identity matrix of size $m$ (the subscript is included only
when necessary). For a sequence of random vectors $\mathbf{x}_n$, $E_j\{\mathbf{x}_n\}$ is the expectation of $\mathbf{x}_n$ under
probability density $p_{j,n}$, $j = 0, 1$. The notation $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ means that $\mathbf{x}$ has the multivariate
Gaussian distribution with mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$.

The paper is organized as follows. The data model is described in Section II. In Section III, 
the closed-form error exponent is obtained via the innovations approach representation. The 
properties of the error exponent are investigated in Section IV, and the extension to the vector 
case is provided in Section V. Simulation results are presented to demonstrate the predicted 
behavior in Section VI, followed by the conclusion in Section VII. 

II. Data Model 

For the purposes of exposition, we will focus primarily on the case in which the signal is 
generated by a scalar time-invariant state space model. The more general vector case will be 
considered below. In particular, we assume that the signal process {si} has a time-invariant 
state-space structure 

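$$
\begin{aligned}
s_{i+1} &= a s_i + u_i, \qquad i = 1, \cdots, n, \\
s_1 &\sim \mathcal{N}(0, \Pi_0), \\
u_i &\overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, Q), \qquad Q = \Pi_0 (1 - a^2),
\end{aligned}
\tag{4}
$$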

where $a$ and $\Pi_0$ are known scalars with $0 \le a \le 1$ and $\Pi_0 > 0$. We assume that the process noise
$\{u_i\}$ is independent of the measurement noise $\{w_i\}$ and the initial state $s_1$ is independent of $u_i$
for all $i$. Notice that the signal sequence $\{s_i\}$ forms a stationary process for this choice of $Q$.
Due to this stationarity, the signal variance is $\Pi_0$ for all $i$, and the SNR $\Gamma$ for the observations
is thus given by

$$
\Gamma = \frac{\Pi_0}{\sigma^2}. \tag{5}
$$

Notice that the value of a determines the amount of correlation between signal samples. For an 
i.i.d. signal we have $a = 0$ and all the signal power results from the process noise $\{u_i\}$. When
the signal is perfectly correlated, on the other hand, $a = 1$ and the signal depends only on the
realization of the initial state $s_1$. The autocovariance function $r_s(\cdot)$ of the signal process $\{s_i\}$ is
given by

$$
r_s(i - j) \triangleq E\{s_i s_j\} = \Pi_0 a^{|i-j|}. \tag{6}
$$

As seen in (1), the observation $y_i$ under the alternative hypothesis is given by a sum of signal
sample $s_i$ and independent noise $w_i$. Thus, the observation sequence $\{y_i\}$ under $H_1$ is not a
Markov process due to the presence of the additive noise even if the signal is Markovian. Let
$r_y^{(j)}(\cdot)$ denote the autocovariance function of the observation process $\{y_i\}$ under $H_j$, i.e.,

$$
r_y^{(j)}(m - n) = E_j\{y_m y_n\}, \tag{7}
$$

and let $S_y^{(j)}(\omega)$ be the spectrum of the observation process under $H_j$, i.e.,

$$
S_y^{(j)}(\omega) = \sum_{k=-\infty}^{\infty} r_y^{(j)}(k)\, e^{-\mathrm{j} k \omega}, \qquad -\pi \le \omega \le \pi. \tag{8}
$$

Then, the spectra of the observation process under $H_0$ and $H_1$ are given by

$$
S_y^{(0)}(\omega) = \sigma^2, \qquad S_y^{(1)}(\omega) = \sigma^2 + S_s(\omega), \qquad -\pi \le \omega \le \pi, \tag{9}
$$

where the signal spectrum under the state-space model is given by the Poisson kernel:

$$
S_s(\omega) = \frac{\Pi_0 (1 - a^2)}{1 - 2a \cos\omega + a^2}. \tag{10}
$$
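
As a quick numerical check of (6) and (10), the following minimal sketch evaluates the Poisson-kernel spectrum and recovers the autocovariance $\Pi_0 a^{|k|}$ from it by averaging over a uniform frequency grid. The function names and the grid size are illustrative assumptions, not part of the paper, and the check is meaningful only for $a < 1$.

```python
import math

def signal_spectrum(omega, a, pi0):
    """Poisson-kernel spectrum (10) of the stationary AR(1) signal."""
    return pi0 * (1.0 - a * a) / (1.0 - 2.0 * a * math.cos(omega) + a * a)

def autocovariance_from_spectrum(lag, a, pi0, grid=4096):
    """Approximate r_s(lag) = (1/2pi) * integral of S_s(w) cos(lag*w) over [-pi, pi)."""
    total = 0.0
    for k in range(grid):
        omega = -math.pi + 2.0 * math.pi * k / grid
        total += signal_spectrum(omega, a, pi0) * math.cos(lag * omega)
    return total / grid

# Compare the inverse transform of (10) with the closed form (6).
a, pi0 = 0.6, 2.0
for lag in range(4):
    print(lag, autocovariance_from_spectrum(lag, a, pi0), pi0 * a ** lag)
```

The two printed columns should agree closely, which is one way to confirm that the stationary choice $Q = \Pi_0(1 - a^2)$ in (4) is consistent with the spectrum used in (9).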

III. Error Exponent for Gauss-Markov Signal in Noise 

In this section, we derive the error exponent of the Neyman-Pearson detector with a fixed level 
$\alpha \in (0, 1)$ for the Gauss-Markov signal described by (4) embedded in noisy observations.

A general approach to the error exponent of Neyman-Pearson detection of Gaussian signals 
can be framed in the spectral domain. It is well known that the Kullback-Leibler information 
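between two zero-mean Gaussian distributions $p_0 = \mathcal{N}(0, \sigma_0^2)$ and $p_1 = \mathcal{N}(0, \sigma_1^2)$ is given by

$$
D(p_0 \| p_1) = \frac{1}{2} \log \frac{\sigma_1^2}{\sigma_0^2} + \frac{1}{2} \frac{\sigma_0^2}{\sigma_1^2} - \frac{1}{2}. \tag{11}
$$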



As noted above, this quantity gives the error exponent in the case of an i.i.d. Gaussian signal. 
In more general cases with correlated Gaussian signals, the error exponent can similarly be 
obtained using the asymptotic properties of covariance matrices. Let $\mathbf{y}_n$ be the random vector
of observation samples $y_i$ defined as

$$
\mathbf{y}_n = [y_1, y_2, \cdots, y_n]^T. \tag{12}
$$

February 1, 2008 DRAFT 



TO APPEAR IN IEEE TRANS. ON INFORMATION THEORY, February 1, 2008 7 

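For two distributions $p_{0,n}(\mathbf{y}_n) = \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{0,n})$ and $p_{1,n}(\mathbf{y}_n) = \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{1,n})$, the error exponent is
given by the almost-sure limit of the Kullback-Leibler rate:

$$
\frac{1}{n} \log \frac{p_{0,n}}{p_{1,n}}(\mathbf{y}_n) = \frac{1}{n} \left( \frac{1}{2} \log \frac{\det(\boldsymbol{\Sigma}_{1,n})}{\det(\boldsymbol{\Sigma}_{0,n})} + \frac{1}{2}\, \mathbf{y}_n^T \left( \boldsymbol{\Sigma}_{1,n}^{-1} - \boldsymbol{\Sigma}_{0,n}^{-1} \right) \mathbf{y}_n \right), \tag{13}
$$

under $p_{0,n}$ [30-34]. Using the asymptotic distribution of the eigenvalues of a Toeplitz matrix
[12, 13], we have

$$
\lim_{n \to \infty} \frac{1}{n} \log(\det(\boldsymbol{\Sigma}_{j,n})) = \frac{1}{2\pi} \int_0^{2\pi} \log S_y^{(j)}(\omega)\, d\omega, \qquad j = 0, 1, \tag{14}
$$

where $S_y^{(j)}(\omega)$ is the spectrum of $\{y_i\}$ under distribution $\mathbf{y}_n \sim p_{j,n}$, which is assumed to have
finite lower and upper bounds. The limiting behavior of $n^{-1} \mathbf{y}_n^T \boldsymbol{\Sigma}_{j,n}^{-1} \mathbf{y}_n$ is also known and is given
by (assuming that the true distribution of $\{y_i\}$ is $p_{0,n}$)

$$
\lim_{n \to \infty} \frac{1}{n}\, \mathbf{y}_n^T \boldsymbol{\Sigma}_{1,n}^{-1} \mathbf{y}_n = \frac{1}{2\pi} \int_0^{2\pi} \frac{S_y^{(0)}(\omega)}{S_y^{(1)}(\omega)}\, d\omega, \tag{15}
$$

$$
\lim_{n \to \infty} \frac{1}{n}\, \mathbf{y}_n^T \boldsymbol{\Sigma}_{0,n}^{-1} \mathbf{y}_n = 1, \tag{16}
$$

where the limits hold in the almost-sure sense under $H_0$, provided that $S_y^{(0)}(\omega)$ and
$S_y^{(1)}(\omega)$ are continuous and strictly positive. (See Lemma 1 and 2 in [14] and Prop. 10.8.2 and
10.8.3 in [4].) Combining (13)-(16), the error exponent for two zero-mean stationary Gaussian
processes is thus given by

$$
\begin{aligned}
K &= \lim_{n \to \infty} \frac{1}{n} \log \frac{p_{0,n}}{p_{1,n}}(\mathbf{y}_n) \quad \text{a.s. } [p_{0,n}], && (17) \\
&= \frac{1}{2\pi} \int_0^{2\pi} \left( \frac{1}{2} \log \frac{S_y^{(1)}(\omega)}{S_y^{(0)}(\omega)} + \frac{1}{2} \frac{S_y^{(0)}(\omega)}{S_y^{(1)}(\omega)} - \frac{1}{2} \right) d\omega, && (18) \\
&= \frac{1}{2\pi} \int_0^{2\pi} D\!\left( \mathcal{N}(0, S_y^{(0)}(\omega)) \,\|\, \mathcal{N}(0, S_y^{(1)}(\omega)) \right) d\omega. && (19)
\end{aligned}
$$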

Intuitively, the error exponent (19) can be explained from (11) using the frequency binning 
argument used to obtain the channel capacity of the Gaussian channel with colored noise from that
of independent parallel Gaussian channels [3]. 
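
The frequency-binning reading of (19) can be tried numerically. The sketch below, which assumes the scalar model (4) and uses illustrative function names and a midpoint rule that are not part of the paper, averages the per-frequency Kullback-Leibler information (11) between $\mathcal{N}(0, S_y^{(0)}(\omega))$ and $\mathcal{N}(0, S_y^{(1)}(\omega))$ over a uniform grid.

```python
import math

def gaussian_kl(var0, var1):
    """D(N(0, var0) || N(0, var1)) for zero-mean Gaussians, as in (11)."""
    return 0.5 * math.log(var1 / var0) + 0.5 * var0 / var1 - 0.5

def spectral_error_exponent(a, pi0, sigma2, grid=8192):
    """Midpoint-rule approximation of the spectral form (19) for the scalar model (4)."""
    total = 0.0
    for k in range(grid):
        omega = 2.0 * math.pi * (k + 0.5) / grid
        # Signal spectrum (10); the observation spectra follow from (9).
        s_signal = pi0 * (1.0 - a * a) / (1.0 - 2.0 * a * math.cos(omega) + a * a)
        total += gaussian_kl(sigma2, sigma2 + s_signal)
    return total / grid

# For a = 0 the spectrum is flat and (19) collapses to the i.i.d. value (11).
print(spectral_error_exponent(0.0, 10.0, 1.0), gaussian_kl(1.0, 11.0))
print(spectral_error_exponent(0.5, 10.0, 1.0))
```

With a flat signal spectrum every frequency bin contributes the same divergence, which is why the first printed pair coincides; for $a > 0$ the bins differ and the integral has to be evaluated numerically unless a closed form is available.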

The spectral form (19) of the error exponent is valid for a wide class of stationary Gaussian 
processes including the autoregressive moving average (ARMA) processes and the hidden Markov 
model (1) - (4). For the detection problem (1) under the scalar state-space model (4), we have

$$
\boldsymbol{\Sigma}_{0,n} = \sigma^2 \mathbf{I}, \qquad \boldsymbol{\Sigma}_{1,n} = \boldsymbol{\Sigma}_{s,n} + \sigma^2 \mathbf{I}, \tag{20}
$$

where $\boldsymbol{\Sigma}_{s,n}(l, m) = \Pi_0 a^{|l-m|}$, $l, m = 1, \cdots, n$, and the two spectra under $H_0$ and $H_1$ are given by
(9). However, it is not straightforward to obtain a closed-form expression for (19) except in some
special cases, e.g., when both of the two distributions of $\{y_i\}$ under $H_0$ and $H_1$ have the Markov
property [35].

In the remainder of the paper, we focus on the derivation of a closed-form expression for the
error exponent K of the miss probability for (1) - (4) by exploiting the state-space structure 
under the alternative hypothesis. We do so by making a connection with Kalman filtering [11]. 
Our expressions will allow us to investigate the properties of the error exponent. 

A. Closed-Form Error Exponent via Innovations Approach 

Theorem 1 (Error exponent) For the Neyman-Pearson detector for the hypotheses (1) with
level $\alpha \in (0, 1)$ (i.e., $P_F \le \alpha$) and $0 \le a < 1$, the error exponent of the miss probability is given
by

$$
K = \frac{1}{2} \log \frac{R_e}{\sigma^2} + \frac{1}{2} \frac{\tilde{R}_e}{R_e} - \frac{1}{2}, \tag{21}
$$

independently of the value of $\alpha$, where $R_e$ and $\tilde{R}_e$ are the steady-state variances of the innovations
process of $\{y_i\}$ calculated under $H_1$ and $H_0$, respectively. Specifically, $R_e$ and $\tilde{R}_e$ are given by

$$
R_e = P + \sigma^2, \tag{22}
$$

and

$$
\tilde{R}_e = \sigma^2 \left( 1 + \frac{a^2 P^2}{P^2 + 2\sigma^2 P + (1 - a^2)\sigma^4} \right), \tag{23}
$$

where

$$
P = \frac{1}{2} \sqrt{[\sigma^2 (1 - a^2) - Q]^2 + 4\sigma^2 Q} - \frac{1}{2} \sigma^2 (1 - a^2) + \frac{Q}{2}. \tag{24}
$$

Here, $P$ is the steady-state error variance of the minimum mean square error (MMSE) estimator
for the signal $s_i$ under the model $H_1$. Note that the error exponent (21) is thus a closed form of
(19) for the state-space model.

Proof: See the Appendix. 
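
The quantities in Theorem 1 are simple enough to evaluate directly. The following minimal sketch computes $P$, $R_e$, $\tilde{R}_e$ and $K$ from the steady-state expressions for the scalar model (4); the function name and its return layout are illustrative assumptions rather than anything defined in the paper.

```python
import math

def closed_form_error_exponent(a, pi0, sigma2):
    """Return (K, P, R_e, R_e_tilde) for the scalar model, assuming 0 <= a < 1."""
    q = pi0 * (1.0 - a * a)                      # process-noise variance Q from (4)
    c = sigma2 * (1.0 - a * a)
    # Steady-state prediction error variance: positive root of the scalar Riccati equation.
    p = 0.5 * math.sqrt((c - q) ** 2 + 4.0 * sigma2 * q) - 0.5 * c + 0.5 * q
    r_e = p + sigma2                             # innovation variance under H1
    r_e_tilde = sigma2 * (1.0 + a * a * p * p /
                          (p * p + 2.0 * sigma2 * p + (1.0 - a * a) * sigma2 ** 2))
    k = 0.5 * math.log(r_e / sigma2) + 0.5 * r_e_tilde / r_e - 0.5
    return k, p, r_e, r_e_tilde

# SNR of 10 (10 dB) with unit noise variance, for a few correlation values.
for a in (0.0, 0.5, 0.9):
    print(a, closed_form_error_exponent(a, 10.0, 1.0)[0])
```

For $a = 0$ the sketch gives $P = \Pi_0$ and $\tilde{R}_e = \sigma^2$, so $K$ reduces to the Kullback-Leibler information between $\mathcal{N}(0, \sigma^2)$ and $\mathcal{N}(0, \Pi_0 + \sigma^2)$.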

Theorem 1 follows from the fact that the almost-sure limit (3) of the normalized log-likelihood
ratio under $H_0$ is the error exponent for general ergodic cases [31-34]. To make the closed-form
calculation of the error exponent tractable for the hidden Markov structure of $\{y_i\}$, we express
the log-likelihood ratio through the innovations representation [6]; the log-likelihood ratio is
given by a function of the sum of squares of the innovations, to which the strong law of large
numbers (SLLN) is applied. The calculated innovations are true innovations, in the sense that they
form an independent sequence, only under $H_1$, i.e., when the signal actually comes from the state-space
model. It is worth noting that $\tilde{R}_e$ is the steady-state variance of the "innovations" calculated
as if the observations result from the alternative when they actually come from the null hypothesis. In
this case, the "innovation" sequence becomes the output of a recursive (whitening) filter driven
by an i.i.d. process $\{y_i\}$ since the Kalman filter converges to the recursive Wiener filter for
time-invariant stable systems [2].
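
To make the innovations representation concrete, here is a minimal sketch of the scalar Kalman one-step predictor for the model (4), accumulating the log-likelihood under $H_1$ as a sum over innovations and forming the LLR against the white-noise null. The function names are illustrative assumptions; the recursion is the standard scalar Kalman predictor started from $P_1 = \Pi_0$, not code taken from the paper.

```python
import math

def innovations_loglik(y, a, pi0, sigma2):
    """Return (log p_1(y), [R_{e,i}]) using the Kalman one-step predictor."""
    q = pi0 * (1.0 - a * a)
    s_pred, p_pred = 0.0, pi0          # prediction of s_1 and its error variance P_1
    loglik, variances = 0.0, []
    for y_i in y:
        r_e = p_pred + sigma2          # innovation variance R_{e,i}
        e_i = y_i - s_pred             # innovation e_i = y_i - yhat_{i|i-1}
        loglik += -0.5 * math.log(2.0 * math.pi * r_e) - 0.5 * e_i * e_i / r_e
        k_p = a * p_pred / r_e         # prediction gain K_{p,i}
        s_pred = a * s_pred + k_p * e_i
        p_pred = a * a * p_pred - k_p * k_p * r_e + q   # Riccati recursion
        variances.append(r_e)
    return loglik, variances

def llr(y, a, pi0, sigma2):
    """log-likelihood ratio log p_1(y) - log p_0(y) with p_0 = i.i.d. N(0, sigma2)."""
    log_p1, _ = innovations_loglik(y, a, pi0, sigma2)
    log_p0 = sum(-0.5 * math.log(2.0 * math.pi * sigma2) - 0.5 * v * v / sigma2 for v in y)
    return log_p1 - log_p0

# The innovation variances settle quickly to their steady-state value.
_, r_seq = innovations_loglik([0.0] * 20, 0.7, 2.0, 1.0)
print(r_seq[0], r_seq[-1])
```

Note that the variance sequence $R_{e,i}$ does not depend on the data, so feeding zeros is enough to observe its convergence; only the innovations themselves change when the observations actually come from $H_0$ rather than $H_1$.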

The relationship between the spectral-domain approach and the innovations approach is ex- 
plained by the canonical spectral factorization, which is well established for the state-space model. 
The asymptotic variance of the innovations sequence is the key parameter in both cases. The 
relationship between the asymptotic performance of the Neyman-Pearson detector and that of 
the Kalman filter is evident in (21) for the state-space model. In both cases, the innovations 
process plays a critical role in characterizing the performance, and the asymptotic variance of the 
innovation is sufficient for the calculation of the error exponent for the Neyman-Pearson detector 
and the steady-state error variance for the Kalman filter. 

IV. Properties of Error Exponent 

In this section, we investigate the properties of the error exponent derived in the preceding 
section. We particularly examine the large sample error behavior with respect to the correlation 
strength and SNR. We show that the intensity of the additive noise significantly changes the 
error behavior with respect to the correlation strength, and the error exponent has a distinct 
phase transition in behavior with respect to the correlation strength depending on SNR. 

Theorem 2 (K vs. correlation) The error exponent $K$ is a continuous function of the correlation
coefficient $a$ ($0 \le a \le 1$) for a given SNR $> 0$. The error exponent as a function of
correlation strength is characterized by the following:

(i) For SNR $\Gamma > 1$, $K$ is monotonically decreasing as the correlation strength increases (i.e.,
$a \uparrow 1$);

(ii) For SNR $\Gamma < 1$, there exists a non-zero value $a^*$ of the correlation coefficient that achieves
the maximal $K$, and $a^*$ is given by the solution of the following equation.



[l + a 2 + r(l-a 2 )] 3 -2 r e + - =0, 




(25) 



where $r_e = R_e/\sigma^2$. Furthermore, $a^*$ converges to one as $\Gamma$ goes to zero.
Proof: See the Appendix. 

We first note that Theorem 2 shows that an i.i.d. signal gives the best error performance for
a given SNR $> 1$ with the maximal error exponent being $D(\mathcal{N}(0, \sigma^2) \| \mathcal{N}(0, \Pi_0 + \sigma^2))$. (In this
case, Theorem 1 reduces to Stein's lemma.) The intuition behind this result is that the signal
component in the observation is strong at high SNR, and the innovations (the new information)
provide more benefit to the detector than the noise averaging effect present for correlated observations.
That is, simple radiometry provides sufficient detection power when the signal level is
above that of the noise. Fig. 1 shows the error exponent as a function of the correlation coefficient
$a$ for SNR $\Gamma = 10$ dB. The monotonicity of the error exponent is clearly seen; moreover,
we see that the amount of decrease becomes larger as $a$ increases. Notice also that the amount
of performance degradation from the i.i.d. case is not severe for weak correlation and the error
exponent decreases suddenly near $a = 1$ and eventually becomes zero at $a = 1$. (It is easy to
show that the miss probability decays with $\Theta(1/\sqrt{n})$ for any SNR at $a = 1$.)



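Fig. 1. $K$ versus correlation coefficient $a$ (SNR = 10 dB): $p_0 = \mathcal{N}(0, \sigma^2)$, $p_1 = \mathcal{N}(0, \Pi_0 + \sigma^2)$. (Plot: error exponent on the vertical axis, correlation coefficient $a$ on the horizontal axis, with the level $D(p_0 \| p_1)$ marked.)
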
In contrast, the error exponent does not decrease monotonically in $a$ for SNR $< 1$, and there
exists an optimal correlation as shown in Fig. 2. It is seen that the i.i.d. case no longer gives
the best error performance for a fixed SNR. The error exponent initially increases as $a$ increases,
and then decreases to zero as $a$ approaches one. As the SNR further decreases (see the cases
of -6 dB and -9 dB) the error exponent decreases for a fixed correlation strength, and the value
of a achieving the maximal error exponent is shifted closer to one. At low SNR the noise in 
the observation dominates. So, intuitively, making the signal more correlated provides greater 
benefit of noise averaging. The lower the SNR, the stronger we would like the correlation to be 
in order to compensate for the dominant noise power, as shown in Fig. 2. However, excessive 
correlation in the signal does not provide new information by observation, and the error exponent 
ultimately converges to zero as a approaches one. Notice that the ratio of the error exponent 
for the optimal correlation to that for the i.i.d. case becomes large as SNR decreases. Hence, 
the improvement due to optimal correlation can be large for low SNR cases. Fig. 3 shows the 
value of a that maximizes the error exponent as a function of SNR. As shown in the figure, unit 
SNR is a transition point between two different behavioral regimes of the error exponent with 
respect to correlation strength, and the transition is very sharp; the optimal correlation strength 
a approaches one rapidly once SNR becomes smaller than one. 
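
The two regimes can be explored numerically with a simple grid search over $a$. The sketch below evaluates the closed-form exponent of Theorem 1 with $\sigma^2 = 1$ and $\Pi_0 = \Gamma$; the function names, the grid resolution, and the restriction to $a \le 0.999$ are illustrative assumptions, and the search is a brute-force stand-in for solving the stationarity condition of Theorem 2.

```python
import math

def error_exponent(a, snr):
    """Closed-form K of Theorem 1 with sigma^2 = 1 and Pi_0 = snr (0 <= a < 1)."""
    q = snr * (1.0 - a * a)
    c = 1.0 - a * a
    p = 0.5 * math.sqrt((c - q) ** 2 + 4.0 * q) - 0.5 * c + 0.5 * q
    r_e = p + 1.0
    r_e_tilde = 1.0 + a * a * p * p / (p * p + 2.0 * p + c)
    return 0.5 * math.log(r_e) + 0.5 * r_e_tilde / r_e - 0.5

def best_correlation(snr, steps=1000):
    """Grid search for the a in {0, 1/steps, ..., 1 - 1/steps} that maximizes K."""
    grid = [k / steps for k in range(steps)]
    return max(grid, key=lambda a: error_exponent(a, snr))

# Maximizing correlation on either side of unit SNR.
for snr_db in (10, 3, -3, -6, -9):
    snr = 10.0 ** (snr_db / 10.0)
    print(snr_db, best_correlation(snr))
```

Above unit SNR the search should return $a = 0$, while below unit SNR it returns a strictly positive value that moves toward one as the SNR drops, in line with the behavior described for Fig. 3.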

The behavior of the error exponent with respect to SNR is given by the following theorem. 
Fig. 3. Optimum correlation strength versus SNR




Theorem 3 (K vs. SNR) The error exponent $K$ is monotonically increasing as SNR increases
for a given correlation coefficient $0 \le a < 1$. Moreover, at high SNR the error exponent $K$
increases linearly with respect to $\frac{1}{2} \log[1 + \mathrm{SNR}(1 - a^2)]$.

Proof: See the Appendix.

The detrimental effect of correlation at high SNR is clear.† The performance degradation due to
correlation is equivalent to the SNR decreasing by a factor of $(1 - a^2)$. The $\log(1 + \mathrm{SNR})$ increase of $K$
w.r.t. SNR is analogous to similar error-rate behavior arising in diversity combining of versions
of a communications signal arriving over independent Rayleigh-faded paths in additive noise,
where the error probability is given by $P_e \sim (1 + \mathrm{SNR})^{-L}$ and $L$ is the number of independent
multipaths. In both cases, the signal component is random. The log SNR behavior of the
optimal Neyman-Pearson detector for stochastic signals applies to general correlations as well
with a modified definition of SNR. Comparing with the detection of a deterministic signal in
noise, where the error exponent is proportional to SNR, the increase of error exponent w.r.t.
SNR is much slower for the case of a stochastic signal in noise. Fig. 4 shows the error exponent
with respect to SNR for a given correlation strength. The log SNR behavior is evident at high
SNR.

†Interestingly, the error exponent at high SNR has the same expression as the capacity of the Gaussian channel.
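
The high-SNR behavior in Theorem 3 can be checked numerically against the closed form of Theorem 1. In the sketch below the function names are illustrative assumptions, and the constant offset of $-\frac{1}{2}$ in the approximation is our own reading of (21) when $\tilde{R}_e / R_e$ becomes negligible; it is not a statement taken from the text.

```python
import math

def error_exponent(a, snr):
    """Closed-form K of Theorem 1 with sigma^2 = 1 and Pi_0 = snr (0 <= a < 1)."""
    q = snr * (1.0 - a * a)
    c = 1.0 - a * a
    p = 0.5 * math.sqrt((c - q) ** 2 + 4.0 * q) - 0.5 * c + 0.5 * q
    r_e = p + 1.0
    r_e_tilde = 1.0 + a * a * p * p / (p * p + 2.0 * p + c)
    return 0.5 * math.log(r_e) + 0.5 * r_e_tilde / r_e - 0.5

def high_snr_approx(a, snr):
    """Assumed high-SNR approximation: (1/2) log[1 + SNR (1 - a^2)] - 1/2."""
    return 0.5 * math.log(1.0 + snr * (1.0 - a * a)) - 0.5

# The gap between the exact exponent and the approximation shrinks as SNR grows.
for snr_db in (0, 10, 20, 30, 40):
    snr = 10.0 ** (snr_db / 10.0)
    print(snr_db, error_exponent(0.5, snr), high_snr_approx(0.5, snr))
```

Plotting both columns against SNR in dB would show two nearly parallel lines at high SNR, which is the linear-in-log-SNR growth the theorem describes.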
Fig. 4. $K$ versus SNR ($a = e^{-1}$)



V. Extension to The Vector Case 

In order to treat general cases in which the signal is a higher order AR process or the signal is 
determined by a linear combination of several underlying phenomena, we now consider a vector 
state-space model, and extend the results of the previous sections to this model. The hypotheses 
for the vector case are given by 

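$$
\begin{aligned}
H_0 &: y_i = w_i, \qquad i = 1, 2, \cdots, n, \\
H_1 &: y_i = \mathbf{h}^T \mathbf{s}_i + w_i,
\end{aligned}
\tag{26}
$$

where $\mathbf{h}$ is a known vector and $\mathbf{s}_i = [s_{1i}, s_{2i}, \cdots, s_{mi}]^T$ is the state of an $m$-dimensional process
at time $i$ following the state-space model

$$
\begin{aligned}
\mathbf{s}_{i+1} &= \mathbf{A} \mathbf{s}_i + \mathbf{B} \mathbf{u}_i, \\
\mathbf{s}_1 &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Pi}_0), \\
\mathbf{u}_i &\overset{\text{i.i.d.}}{\sim} \mathcal{N}(\mathbf{0}, \mathbf{Q}), \qquad \mathbf{Q} > 0.
\end{aligned}
\tag{27}
$$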

We assume that the feedback and input matrices, $\mathbf{A}$ and $\mathbf{B}$, are known with $|\lambda_k(\mathbf{A})| < 1$ for
all $k$, and that the process noise $\{\mathbf{u}_i\}$ is independent of the measurement noise $\{w_i\}$. We also assume
that the initial state $\mathbf{s}_1$ is independent of $\mathbf{u}_i$ for all $i$, and the initial covariance $\boldsymbol{\Pi}_0$ satisfies the
following Lyapunov equation

$$
\boldsymbol{\Pi}_0 = \mathbf{A} \boldsymbol{\Pi}_0 \mathbf{A}^T + \mathbf{B} \mathbf{Q} \mathbf{B}^T. \tag{28}
$$

Thus, the signal sequence $\{\mathbf{s}_i\}$ forms a stationary vector process. In this case the SNR is defined
similarly to (5) as $\mathbf{h}^T \boldsymbol{\Pi}_0 \mathbf{h} / \sigma^2$. The autocovariance of the observation sequence $\{y_i\}$ is given by

$$
E\{y_i y_j\} =
\begin{cases}
\sigma^2 \delta_{ij} & \text{under } H_0, \\
\mathbf{h}^T \mathbf{A}^{|i-j|} \boldsymbol{\Pi}_0 \mathbf{h} + \sigma^2 \delta_{ij} & \text{under } H_1,
\end{cases}
\tag{29}
$$

where $\delta_{ij}$ is the Kronecker delta. Thus, the covariance matrix of the observation under $H_1$
is a symmetric Toeplitz matrix with $\mathbf{h}^T \mathbf{A}^l \boldsymbol{\Pi}_0 \mathbf{h}$ as the $l$th off-diagonal entry ($l \ge 1$). Since
$|\lambda_k(\mathbf{A})| < 1$ for all $k$, $c_l = \mathbf{h}^T \mathbf{A}^l \boldsymbol{\Pi}_0 \mathbf{h}$ is an absolutely summable sequence and the eigenvalues of
the covariance matrix of $\mathbf{y}_n$ are bounded both from below and from above.

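Theorem 4 (Error exponent) For the Neyman-Pearson detector for the hypotheses (26) and
(27) with level $\alpha \in (0, 1)$ (i.e., $P_F \le \alpha$) and $|\lambda_k(\mathbf{A})| < 1$ for all $k$, the error exponent of the miss
probability is given by (21) independently of the value of $\alpha$. The steady-state variances of the
innovation process $R_e$ and $\tilde{R}_e$ calculated under $H_1$ and $H_0$, respectively, are given by

$$
R_e = \sigma^2 + \mathbf{h}^T \mathbf{P} \mathbf{h}, \tag{30}
$$

where $\mathbf{P}$ is the unique stabilizing solution of the discrete-time algebraic Riccati equation

$$
\mathbf{P} = \mathbf{A} \mathbf{P} \mathbf{A}^T + \mathbf{B} \mathbf{Q} \mathbf{B}^T - \frac{\mathbf{A} \mathbf{P} \mathbf{h} \mathbf{h}^T \mathbf{P} \mathbf{A}^T}{\mathbf{h}^T \mathbf{P} \mathbf{h} + \sigma^2}, \tag{31}
$$

and

$$
\tilde{R}_e = \sigma^2 (1 + \mathbf{h}^T \tilde{\mathbf{P}} \mathbf{h}), \tag{32}
$$

where $\tilde{\mathbf{P}}$ is the unique positive-semidefinite solution of the following Lyapunov equation

$$
\tilde{\mathbf{P}} = (\mathbf{A} - \mathbf{K}_p \mathbf{h}^T) \tilde{\mathbf{P}} (\mathbf{A} - \mathbf{K}_p \mathbf{h}^T)^T + \mathbf{K}_p \mathbf{K}_p^T, \tag{33}
$$

and $\mathbf{K}_p = \mathbf{A} \mathbf{P} \mathbf{h} R_e^{-1}$.

In spectral form, $K$ is given by (19), where $S_y^{(0)}(\omega) = \sigma^2$ ($-\pi \le \omega \le \pi$) and $S_y^{(1)}(\omega)$ is given
by

$$
S_y^{(1)}(\omega) =
\begin{bmatrix} \mathbf{h}^T (e^{\mathrm{j}\omega} \mathbf{I} - \mathbf{A})^{-1} & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{B} \mathbf{Q} \mathbf{B}^T & \mathbf{0} \\ \mathbf{0} & \sigma^2 \end{bmatrix}
\begin{bmatrix} (e^{-\mathrm{j}\omega} \mathbf{I} - \mathbf{A}^T)^{-1} \mathbf{h} \\ 1 \end{bmatrix}.
\tag{34}
$$

Proof: See the Appendix.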

Proof: See the Appendix. 



For this vector model, simple results describing the properties of the error exponent are not 
tractable since the relevant expressions depend on the multiple eigenvalues of the matrix A. 
However, (21), (31) and (33) provide closed-form expressions for the error exponent which can 
easily be explored numerically. 
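
One way to carry out such a numerical exploration is sketched below using only the standard library. It iterates the Riccati recursion associated with (31) and the Lyapunov recursion associated with (33) until they settle, then evaluates (30), (32) and (21). The function names, the fixed iteration count, and the starting point $\mathbf{P} = \mathbf{B}\mathbf{Q}\mathbf{B}^T$ are illustrative assumptions; plain fixed-point iteration is used here in place of a dedicated Riccati or Lyapunov solver.

```python
import math

def mat_mul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(col) for col in zip(*x)]

def mat_vec(x, v):
    return [sum(x[i][j] * v[j] for j in range(len(v))) for i in range(len(x))]

def quad_form(h, p):
    """Scalar h^T P h."""
    return sum(hi * ph_i for hi, ph_i in zip(h, mat_vec(p, h)))

def vector_error_exponent(A, B, Q, h, sigma2, iters=2000):
    """Error exponent (21) for the vector model via fixed-point iteration (A stable)."""
    m = len(h)
    At = transpose(A)
    bqb = mat_mul(mat_mul(B, Q), transpose(B))
    P = [row[:] for row in bqb]
    for _ in range(iters):                       # Riccati recursion toward (31)
        r_e = quad_form(h, P) + sigma2
        aph = mat_vec(A, mat_vec(P, h))          # A P h
        apa = mat_mul(mat_mul(A, P), At)
        P = [[apa[i][j] + bqb[i][j] - aph[i] * aph[j] / r_e for j in range(m)]
             for i in range(m)]
    r_e = quad_form(h, P) + sigma2               # (30)
    k_p = [v / r_e for v in mat_vec(A, mat_vec(P, h))]   # K_p = A P h / R_e
    F = [[A[i][j] - k_p[i] * h[j] for j in range(m)] for i in range(m)]
    Ft = transpose(F)
    kk = [[ki * kj for kj in k_p] for ki in k_p]         # K_p K_p^T
    Pt = [row[:] for row in kk]
    for _ in range(iters):                       # Lyapunov recursion toward (33)
        fpf = mat_mul(mat_mul(F, Pt), Ft)
        Pt = [[fpf[i][j] + kk[i][j] for j in range(m)] for i in range(m)]
    r_e_tilde = sigma2 * (1.0 + quad_form(h, Pt))        # (32)
    return 0.5 * math.log(r_e / sigma2) + 0.5 * r_e_tilde / r_e - 0.5

# Scalar special case (m = 1) with a = 0.5, Pi_0 = 1, sigma^2 = 1.
print(vector_error_exponent([[0.5]], [[1.0]], [[0.75]], [1.0], 1.0))
```

For $m = 1$ the result should coincide with the scalar closed form of Theorem 1, which gives a convenient sanity check before moving to higher-order AR signals.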


VI. Simulation Results 

To verify the behavior of the miss probability predicted by our asymptotic analysis, in this 
section we provide some simulation results. We consider the scalar model (4), for SNR of 10 dB 
and -3 dB, and for several correlation strengths. The probability of false alarm is set at 0.1%
for all cases we consider. 

Fig. 5. $P_M$ versus number of samples (SNR = 10 dB). (Plot: miss probability on a logarithmic vertical axis from $10^{-3}$ to $10^{0}$; number of samples from 10 to 50 on the horizontal axis.)

Fig. 5 shows the simulated miss probability as a function of the number of samples for 10 dB 
SNR. It is seen, as predicted by our analysis, that the i.i.d. case (a = 0) has the largest slope 
for error decay, and the slope is monotonically decreasing as a increases to one. Notice that 
the error performance for the same number of observations is significantly different for different 
correlation strengths for the same SNR, and the performance for weak correlation is not much 
different from the i.i.d. case, as predicted by Fig. 1. It is also seen that the miss probability for 
the perfectly correlated case (a = 1) is not exponentially decaying, again confirming our analysis. 

The simulated error performance for SNR of -3 dB is shown in Fig. 6. It is seen that the
asymptotic slope of $\log P_M$ increases as $a$ increases from zero as predicted by Theorem 2, and
reaches a maximum with a sudden decrease after the maximum. Notice that the error curve
is still not a straight line for the low SNR case due to the $o(n)$ term in the exponent of the
error probability. Since the error exponent increases only with log SNR, the required number of
observations for -3 dB SNR is much larger than for 10 dB SNR for the same miss probability.
It is clearly seen that $P_M$ is still larger than $10^{-2}$ for 200 samples whereas it is $10^{-4}$ with 20
samples for the 10 dB SNR case.

Fig. 6. $P_M$ versus number of samples (SNR = -3 dB)

VII. Conclusion 

We have considered the detection of correlated random signals using noisy observations. We 
have derived the error exponent for the Neyman-Pearson detector of a fixed level using the 
spectral domain and the innovations approaches. We have also provided the error exponent in 
closed form for the vector state-space model. The closed-form expression is valid not only for 
the state-space model but also for any orthogonal transformation of the original observations 
under the state-space model, since the spectral domain result does not change by orthogonal 
transformation and Theorem 1 is a closed-form expression of the invariant spectral form. We 
have investigated the properties of the error exponent for the scalar case. The error exponent is 
a function of SNR and correlation strength. The behavior of the error exponent as a function of 
correlation strength is sharply divided into two regimes depending on SNR. For SNR > 1 the 
error exponent is monotonically decreasing in the signal correlation. On the other hand, for SNR 
< 1, there is a non-zero correlation strength that gives the maximal error exponent. Simulations 
confirm the validity of our asymptotic results for finite sample sizes. 
Appendix

Proof of Theorem 1 

Since the error exponent for the Neyman-Pearson detector with a fixed level $\alpha \in (0, 1)$ is given
by the almost-sure limit of the normalized log-likelihood ratio $-\frac{1}{n} \log L_n(\mathbf{y}_n)$ under $H_0$ (if the
limit exists) [31-34], we focus on the calculation of the limit. We show that $\frac{1}{n} \log L_n$ converges
a.s. under $H_0$ for Gauss-Markov signals in noise using the limit distribution of the innovations
sequence. The log-likelihood ratio is given by

logL n (y n ) = logpi, n (y n ) - logp ,n(yn)- (35) 

We have, for the second term on the right-hand side (RHS) of (35),

$$\frac{1}{n}\log p_{0,n}(y_n) = -\frac{1}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2 n}\sum_{i=1}^{n} y_i^2, \tag{36}$$

and so, under $H_0$,

$$\frac{1}{n}\log p_{0,n}(y_n) \to -\frac{1}{2}\log(2\pi\sigma^2) - \frac{1}{2} \quad \text{a.s.}, \tag{37}$$

since $\frac{1}{n}\sum_{i=1}^{n} y_i^2 \to E_0\{y_i^2\} = \sigma^2$ a.s. under $H_0$ as $n \to \infty$ by the SLLN. Now consider the first
term on the RHS of (35). The log-likelihood under $H_1$ can be obtained via the Kalman recursion
for the innovations [1], [6]. Specifically, define $l_i = \log p_{1,i}(y_1, y_2, \cdots, y_i)$ and $y_1^i = \{y_1, y_2, \cdots, y_i\}$.
Then,

$$p_{1,i}(y_1^i) = p_{1,i-1}(y_1^{i-1})\, p_{1,i}(y_i \mid y_1^{i-1}), \quad 2 \le i \le n. \tag{38}$$

Hence,

$$l_i = l_{i-1} + \log p_{1,i}(y_i \mid y_1^{i-1}), \quad 2 \le i \le n, \tag{39}$$

where $l_1 = \log p_{1,1}(y_1)$. Since the joint distribution of $\{y_1, y_2, \cdots, y_i\}$ is Gaussian, the conditional
distribution $p_{1,i}(y_i \mid y_1^{i-1})$ is also Gaussian with mean $\hat{y}_{i|i-1}$ and variance $R_{e,i}$. The log-likelihood $l_i$ is
expressed using the innovations representation by

$$l_i = l_{i-1} - \frac{1}{2}\log(2\pi R_{e,i}) - \frac{e_i^2}{2R_{e,i}}, \tag{40}$$
where the minimum mean square error (MMSE) prediction $\hat{y}_{i|i-1}$ of $y_i$ is the conditional
expectation $E_1\{y_i \mid y_1^{i-1}\}$ and the innovation is given by $e_i = y_i - \hat{y}_{i|i-1}$ with variance $R_{e,i} = E_1\{e_i^2\}$.
Hence,

$$\frac{1}{n}\log p_{1,n}(y_n) = \frac{1}{n} l_n = -\frac{1}{2}\log(2\pi) - \frac{1}{2n}\sum_{i=1}^{n}\log R_{e,i} - \frac{1}{2n}\sum_{i=1}^{n}\frac{e_i^2}{R_{e,i}}. \tag{41}$$

The second term on the RHS of (41) is not random, and we have

$$\frac{1}{2n}\sum_{i=1}^{n}\log R_{e,i} \to \frac{1}{2}\log R_e, \tag{42}$$

by the Cesaro mean theorem since $R_{e,i} \to R_e$ and $R_{e,i} \ge \sigma^2 > 0$ for all $i$, where $R_e$ is given by

$$R_e = P + \sigma^2, \tag{43}$$

and where $P$ is the steady-state error variance of the optimal one-step predictor for the signal
$\{s_i\}$. Now representing $e_i$ as a linear combination of $y_1, y_2, \cdots, y_i$ gives

$$\begin{aligned}
e_i = {} & y_i - K_{p,i-1} y_{i-1} - (a - K_{p,i-1}) K_{p,i-2}\, y_{i-2} - (a - K_{p,i-1})(a - K_{p,i-2}) K_{p,i-3}\, y_{i-3} \\
& - (a - K_{p,i-1})(a - K_{p,i-2})(a - K_{p,i-3}) K_{p,i-4}\, y_{i-4} - \cdots,
\end{aligned} \tag{44}$$

where $K_{p,i} = a P_i R_{e,i}^{-1}$ is the Kalman prediction gain, $P_i = E_1\{(s_i - \hat{s}_{i|i-1})^2\}$ is the error variance
at time $i$, and $\hat{s}_{i|i-1}$ is the linear MMSE prediction of $s_i$ given $y_1^{i-1}$. Since the Kalman filter converges
asymptotically to the time-invariant recursive Wiener filter for $0 \le a < 1$, we have asymptotically

$$\begin{aligned}
e_i = {} & y_i - K_p y_{i-1} - (a - K_p) K_p\, y_{i-2} - (a - K_p)(a - K_p) K_p\, y_{i-3} \\
& - (a - K_p)(a - K_p)(a - K_p) K_p\, y_{i-4} - \cdots,
\end{aligned} \tag{45}$$

where $K_p$ is the steady-state Kalman prediction gain. Thus, under $H_0$ the innovations sequence
becomes the output of a stable recursive filter driven by an i.i.d. sequence $\{y_i\}$, and it is known
to be an ergodic sequence. By the ergodic theorem, $\frac{1}{n}\sum_{i=1}^{n} e_i^2$ converges to the true expectation,
which is given by

$$\tilde{R}_e = \lim_{i \to \infty} E_0\{e_i^2\} = \sigma^2\left(1 + \frac{K_p^2}{1 - (a - K_p)^2}\right), \tag{46}$$

since $\{y_i\}_{i=1}^{\infty}$ is an independent sequence under $H_0$. Substituting $K_p$ and $R_e$, we have

$$\tilde{R}_e = \sigma^2\left(1 + \frac{a^2 P^2}{R_e^2 - a^2\sigma^4}\right) = \sigma^2\left(1 + \frac{a^2 P^2}{P^2 + 2\sigma^2 P + (1 - a^2)\sigma^4}\right). \tag{47}$$
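As a numerical sanity check on (43), (46), and (47), the following sketch iterates a scalar Riccati recursion to its fixed point and compares the series behind (46) with the closed form (47). It assumes the scalar model with process noise variance $Q = \Pi_0(1 - a^2)$ and the scalar specialization of the Riccati equation (84); the function names and parameter values are illustrative and do not come from the paper.

```python
import numpy as np

def steady_state_prediction_variance(a, Q, sigma2, iters=1000):
    """Iterate the scalar Riccati recursion P <- a^2 P sigma2 / (P + sigma2) + Q."""
    P = Q  # assumed starting point; the recursion is a contraction for 0 <= a < 1
    for _ in range(iters):
        P = a**2 * P * sigma2 / (P + sigma2) + Q
    return P

def innovation_variances(a, Pi0, sigma2):
    """Return (R_e, R_e_tilde from the series behind (46), R_e_tilde from (47))."""
    Q = Pi0 * (1.0 - a**2)
    P = steady_state_prediction_variance(a, Q, sigma2)
    Re = P + sigma2                       # Eq. (43)
    Kp = a * P / Re                       # steady-state Kalman prediction gain
    k = np.arange(2000)
    # Filter taps of (45): 1, -K_p, -(a-K_p)K_p, -(a-K_p)^2 K_p, ...
    series = sigma2 * (1.0 + np.sum((Kp * (a - Kp) ** k) ** 2))
    closed = sigma2 * (1.0 + a**2 * P**2 /
                       (P**2 + 2.0 * sigma2 * P + (1.0 - a**2) * sigma2**2))
    return Re, series, closed

Re, Rt_series, Rt_closed = innovation_variances(a=0.5, Pi0=1.0, sigma2=1.0)
print(Re, Rt_series, Rt_closed)
```

For these illustrative values the fixed point is $P = \sqrt{3}/2$, so the two expressions for $\tilde{R}_e$ can be compared directly.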
Now, the last term on the RHS of (41) is given by

$$\frac{1}{2n}\sum_{i=1}^{n}\frac{e_i^2}{R_{e,i}} = \frac{1}{2n}\sum_{i=1}^{n}\frac{e_i^2}{R_e}\,\frac{R_e}{R_{e,i}} = \frac{1}{2n}\sum_{i=1}^{n}\frac{e_i^2}{R_e}\,\frac{R_e + C\epsilon^i - C\epsilon^i}{R_e + C\epsilon^i} \tag{48}$$

$$= \frac{1}{2nR_e}\sum_{i=1}^{n} e_i^2 - \frac{1}{2nR_e}\sum_{i=1}^{n} e_i^2\,\frac{C\epsilon^i}{R_e + C\epsilon^i}, \tag{49}$$

for some positive constant $C$ and $|\epsilon| < 1$, by the exponential convergence of $R_{e,i}$ to $R_e$. The first
term in (49) converges to $\frac{\tilde{R}_e}{2R_e}$ and the second term converges to zero since $\frac{1}{n}\sum_{i=1}^{n} e_i^2$ converges
to a finite constant and $R_e \ge \sigma^2 > 0$. Hence, (21) follows for $0 \le a < 1$. When $a = 1$, we have
$P_M \sim \Theta(1/\sqrt{n})$ and $K = 0$. We also have $P = 0$, $R_e = \tilde{R}_e = \sigma^2$ in (22) - (24) at $a = 1$. Thus,
(20) has a value of zero at $a = 1$, and Theorem 1 holds for $0 \le a \le 1$.
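The almost-sure convergence argued above can be illustrated by simulation. The sketch below draws noise-only data under $H_0$, runs the scalar Kalman recursion to accumulate $\log p_{1,n}$ through the innovations as in (40), and compares $-\frac{1}{n}\log L_n$ with the closed-form exponent. It assumes the stationary prior $s_1 \sim \mathcal{N}(0, \Pi_0)$ and takes $P$ as the positive root of the scalar Riccati equation, which is assumed to coincide with (24); all names, the seed, and the sample size are illustrative.

```python
import math
import numpy as np

def closed_form_K(a, snr, sigma2=1.0):
    """K = 0.5*log(R_e/sigma2) + 0.5*R_e_tilde/R_e - 0.5, as in (55)."""
    Q = snr * sigma2 * (1.0 - a * a)
    c = sigma2 * (1.0 - a * a)
    P = 0.5 * (math.sqrt((c - Q) ** 2 + 4.0 * sigma2 * Q) - c + Q)
    Re = P + sigma2
    Re_tilde = sigma2 * (1.0 + a * a * P * P / (Re * Re - a * a * sigma2 ** 2))
    return 0.5 * math.log(Re / sigma2) + 0.5 * Re_tilde / Re - 0.5

def normalized_llr_under_h0(a, snr, n, sigma2=1.0, seed=0):
    """Return (1/n) log L_n for one noise-only realization y_1, ..., y_n."""
    rng = np.random.default_rng(seed)
    y = rng.normal(0.0, math.sqrt(sigma2), size=n).tolist()
    Pi0 = snr * sigma2
    Q = Pi0 * (1.0 - a * a)
    s_pred, P = 0.0, Pi0          # assumed prior: s_1 ~ N(0, Pi0)
    log_p1 = 0.0
    sum_y2 = 0.0
    for yi in y:
        Re_i = P + sigma2                         # innovation variance R_{e,i}
        e = yi - s_pred                           # innovation e_i
        log_p1 += -0.5 * math.log(2.0 * math.pi * Re_i) - 0.5 * e * e / Re_i
        Kp = a * P / Re_i                         # Kalman prediction gain
        s_pred = a * s_pred + Kp * e              # one-step signal prediction
        P = a * a * P * sigma2 / Re_i + Q         # Riccati update
        sum_y2 += yi * yi
    log_p0 = -0.5 * n * math.log(2.0 * math.pi * sigma2) - 0.5 * sum_y2 / sigma2
    return (log_p1 - log_p0) / n

a, snr = 0.5, 1.0
print(-normalized_llr_under_h0(a, snr, n=100_000), closed_form_K(a, snr))
```

The two printed numbers should be close for large $n$; the gap shrinks roughly like $1/\sqrt{n}$ for a single realization.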

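Now we show that (21) is equivalent to the spectral domain result (19) using spectral factorization.
From the spectral domain form (19) we have

$$K = \frac{1}{4\pi}\int_0^{2\pi}\log S_y(\omega)\,d\omega + \frac{1}{4\pi}\int_0^{2\pi}\frac{\sigma^2}{S_y(\omega)}\,d\omega - \frac{1}{2}\log\sigma^2 - \frac{1}{2}, \tag{50}$$

where $S_y(\omega) = \sigma^2 + S_s(\omega)$.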

First, consider the first term on the RHS of (50). The argument of the logarithm is the power
spectral density of the observation sequence $\{y_i\}$ under $H_1$. From Wiener filtering theory, the
canonical spectral factorization for $S_y(z)$ is given by ([2], p. 275)

$$S_y(z) = L(z) R_e L^*(z^{-*}), \tag{51}$$

where $L^{-1}(z)$ is the whitening filter. Hence, we have

$$\begin{aligned}
\frac{1}{4\pi}\int_0^{2\pi}\log S_y(\omega)\,d\omega
&= \frac{1}{4\pi}\int_0^{2\pi}\log\big(L(e^{j\omega}) R_e L^*(e^{j\omega})\big)\,d\omega \\
&= \frac{1}{4\pi}\int_0^{2\pi}\big(\log R_e + \log L(e^{j\omega}) + \log L^*(e^{j\omega})\big)\,d\omega \\
&= \frac{1}{2}\log R_e,
\end{aligned}$$
where the last step follows from the cancellation of two other terms in para-Hermitian conjugacy. 
Now, consider the second term on the RHS of (50). From (51), we have

$$\frac{\sigma^2}{S_y(z)} = \frac{\sigma^2 L^{-1}(z)\,\big(L^*(z^{-*})\big)^{-1}}{R_e}, \tag{52}$$



February 1, 2008 DRAFT 



TO APPEAR IN IEEE TRANS. ON INFORMATION THEORY, February 1, 2008 20 

which is the spectral density of the innovations process under $H_0$ divided by $R_e$, since $\{y_i\}$ is an
i.i.d. sequence with variance $\sigma^2$ under $H_0$ and $L^{-1}(z)$ is the whitening filter. Since the variance
of a stationary process is given by the autocovariance function $r(l)$ setting $l = 0$, we have, by
the definition of $\tilde{R}_e$,

$$\tilde{R}_e = r(0) = \frac{1}{2\pi}\int_0^{2\pi}\sigma^2\Big[L^{-1}(z)\,\big(L^*(z^{-*})\big)^{-1}\Big]_{z = e^{j\omega}}\,d\omega, \tag{53}$$

since the spectral density is the Fourier transform of the autocovariance function. (Eq. (23) is
an explicit formula for (53).) Hence, we have

$$\frac{1}{4\pi}\int_0^{2\pi}\frac{\sigma^2}{S_y(\omega)}\,d\omega = \frac{\tilde{R}_e}{2R_e}, \tag{54}$$

and (50) is given by

$$\frac{1}{2}\log R_e + \frac{1}{2}\frac{\tilde{R}_e}{R_e} - \frac{1}{2}\log\sigma^2 - \frac{1}{2} = \frac{1}{2}\log\frac{R_e}{\sigma^2} + \frac{1}{2}\frac{\tilde{R}_e}{R_e} - \frac{1}{2}, \tag{55}$$

which is the error exponent in Theorem 1 that we derived using the innovations approach. ■
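The equivalence of the innovations form and the spectral form can also be checked numerically. The sketch below evaluates the closed form in (55) and a uniform-grid quadrature of the two spectral integrals. It assumes the scalar signal spectrum $S_s(\omega) = \Pi_0(1 - a^2)/(1 - 2a\cos\omega + a^2)$ and takes $P$ as the positive root of the scalar Riccati equation, which is assumed to coincide with (24); function names and grid size are illustrative.

```python
import numpy as np

def closed_form_K(a, snr, sigma2=1.0):
    """Innovations-domain exponent: 0.5*log(R_e/sigma2) + 0.5*R_e_tilde/R_e - 0.5."""
    Q = snr * sigma2 * (1.0 - a**2)
    c = sigma2 * (1.0 - a**2)
    P = 0.5 * (np.sqrt((c - Q) ** 2 + 4.0 * sigma2 * Q) - c + Q)
    Re = P + sigma2
    Re_tilde = sigma2 * (1.0 + a**2 * P**2 / (Re**2 - a**2 * sigma2**2))
    return 0.5 * np.log(Re / sigma2) + 0.5 * Re_tilde / Re - 0.5

def spectral_K(a, snr, sigma2=1.0, m=1 << 16):
    """Spectral-domain exponent, integrating over one period on a uniform grid."""
    w = 2.0 * np.pi * np.arange(m) / m
    Ss = snr * sigma2 * (1.0 - a**2) / (1.0 - 2.0 * a * np.cos(w) + a**2)
    Sy = sigma2 + Ss
    # (1/4pi) * integral over [0, 2pi) equals 0.5 * (grid mean) for a periodic integrand
    return 0.5 * np.mean(np.log(Sy / sigma2)) + 0.5 * np.mean(sigma2 / Sy) - 0.5

for a in (0.0, 0.5, 0.9):
    for snr in (0.1, 1.0, 10.0):
        print(a, snr, closed_form_K(a, snr), spectral_K(a, snr))
```

The uniform grid is used because the integrand is smooth and periodic, for which the rectangle rule converges very quickly.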

Lemma 1: The partial derivative of the error exponent with respect to the correlation coefficient
$a$ is given by

$$\frac{\partial K}{\partial a} = \frac{\Gamma(b - a)}{r_e(1 - b^2)}\left(\frac{1}{1 - ab} - \frac{2(1 - ab)}{r_e(1 - b^2)^2}\right) \tag{56}$$

for a fixed SNR $\Gamma$, where $b = a/r_e$ and $r_e = R_e/\sigma^2$.

It is easily seen that the partial derivative $\partial K/\partial a$ is a continuous function of $a$ for $0 \le a < 1$ since
$r_e$ is a continuous function of $a$ from (22) and (24).

Proof of Lemma 1

We use the spectral domain form for the error exponent:

$$\begin{aligned}
K &= -\frac{1}{4\pi}\int_0^{2\pi}\log\frac{\sigma^2}{\sigma^2 + S_s(\omega)}\,d\omega + \frac{1}{4\pi}\int_0^{2\pi}\frac{\sigma^2}{\sigma^2 + S_s(\omega)}\,d\omega - \frac{1}{2} \\
&= -\frac{1}{4\pi}\int_0^{2\pi}\log\frac{1}{1 + \Gamma\tilde{S}_s(\omega)}\,d\omega + \frac{1}{4\pi}\int_0^{2\pi}\frac{1}{1 + \Gamma\tilde{S}_s(\omega)}\,d\omega - \frac{1}{2},
\end{aligned} \tag{57}$$

where

$$S_s(\omega) = \frac{\Pi_0(1 - a^2)}{1 - 2a\cos\omega + a^2}, \qquad \tilde{S}_s(\omega) = S_s(\omega)/\Pi_0, \qquad \Gamma = \frac{\Pi_0}{\sigma^2}. \tag{58}$$

The spectral density of the observation sequence $\{y_i\}$ is given by

$$S_y(z) = \sigma^2 + \frac{Q}{(1 - az^{-1})(1 - az)} = \sigma^2\left(1 + \frac{Q/\sigma^2}{(1 - az^{-1})(1 - az)}\right), \tag{59}$$

where $Q = \Pi_0(1 - a^2)$, and its canonical spectral factorization is given by ([2], p. 242)

$$S_y(z) = \sigma^2 L(z)\, r_e\, L^*(z^{-*}) = \sigma^2 r_e\,\frac{(1 - bz^{-1})(1 - bz)}{(1 - az^{-1})(1 - az)}, \tag{60}$$

where $b = a/r_e$ ($|a| < 1$ and $|b| < 1$) and

$$r_e = \frac{\sqrt{[1 + a^2 + Q/\sigma^2]^2 - 4a^2} + 1 + a^2 + Q/\sigma^2}{2} = \frac{R_e}{\sigma^2}. \tag{61}$$
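The factorization (60) with $r_e$ from (61) can be verified pointwise on the unit circle. The following sketch compares the two sides of (59) and (60) after dividing by $\sigma^2$, using $Q/\sigma^2 = \Gamma(1 - a^2)$; the function names and the test values are illustrative assumptions.

```python
import numpy as np

def spectral_factor(a, snr):
    """Return (r_e, b) from Eq. (61), with b = a / r_e and Q/sigma^2 = snr*(1 - a^2)."""
    m = 1.0 + a**2 + snr * (1.0 - a**2)
    r_e = 0.5 * (np.sqrt(m**2 - 4.0 * a**2) + m)
    return r_e, a / r_e

def factorization_gap(a, snr, num=2048):
    """Largest pointwise gap between S_y/sigma^2 from (59) and from (60) on |z| = 1."""
    r_e, b = spectral_factor(a, snr)
    z = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, num))
    # On the unit circle, (1 - a z^{-1})(1 - a z) = |1 - a z|^2.
    direct = 1.0 + snr * (1.0 - a**2) / np.abs(1.0 - a * z) ** 2
    factored = r_e * np.abs(1.0 - b * z) ** 2 / np.abs(1.0 - a * z) ** 2
    return np.max(np.abs(direct - factored))

r_e, b = spectral_factor(0.9, 10.0)
print(r_e, b, factorization_gap(0.9, 10.0))
```

Choosing the larger root in (61) is what places the zero $b$ inside the unit circle, so that $L(z)$ is minimum phase.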

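The partial derivative of $K$ with respect to $a$ is given by

$$\frac{\partial K}{\partial a} = \frac{1}{4\pi}\int_0^{2\pi}\frac{\Gamma\tilde{S}_s'(\omega)}{1 + \Gamma\tilde{S}_s(\omega)}\,d\omega - \frac{1}{4\pi}\int_0^{2\pi}\frac{\Gamma\tilde{S}_s'(\omega)}{(1 + \Gamma\tilde{S}_s(\omega))^2}\,d\omega, \tag{62}$$

where

$$\tilde{S}_s'(\omega) = \frac{\partial\tilde{S}_s(\omega)}{\partial a} = \frac{2[(1 + a^2)\cos\omega - 2a]}{(1 - 2a\cos\omega + a^2)^2}. \tag{63}$$

Consider the first term on the RHS of (62). Using the canonical spectral decomposition (60), we
have

$$\begin{aligned}
\frac{1}{4\pi}\int_0^{2\pi}\frac{\Gamma\tilde{S}_s'(\omega)}{1 + \Gamma\tilde{S}_s(\omega)}\,d\omega
&= \frac{1}{4\pi}\int_0^{2\pi}\frac{2\Gamma[(1 + a^2)\cos\omega - 2a]\,/\,(1 - 2a\cos\omega + a^2)^2}{1 + \Gamma\tilde{S}_s(\omega)}\,d\omega \\
&= \frac{1}{4\pi}\oint\frac{\Gamma[(1 + a^2)(z + z^{-1}) - 4a]}{r_e(1 - az)(1 - az^{-1})(1 - bz)(1 - bz^{-1})}\,\frac{dz}{jz} \\
&= \frac{1}{4\pi j}\oint\frac{\Gamma[(1 + a^2)(z^2 + 1) - 4az]}{r_e(1 - az)(z - a)(1 - bz)(z - b)}\,dz \\
&= \frac{\Gamma}{4\pi j\, r_e}\, 2\pi j\sum_{|z| < 1}\operatorname{Res}\,\frac{(1 + a^2)(z^2 + 1) - 4az}{(1 - az)(z - a)(1 - bz)(z - b)} \\
&= \frac{\Gamma}{2r_e}\left(\frac{1 - a^2}{(1 - ab)(a - b)} + \frac{(1 + a^2)(1 + b^2) - 4ab}{(1 - ab)(b - a)(1 - b^2)}\right) \\
&= \frac{\Gamma(b - a)}{r_e(1 - ab)(1 - b^2)},
\end{aligned} \tag{64}$$

where we have substituted $z = e^{j\omega}$, and used the residue theorem. The second term on the RHS
of (62) is similarly obtained:

$$\begin{aligned}
\frac{1}{4\pi}\int_0^{2\pi}\frac{\Gamma\tilde{S}_s'(\omega)}{(1 + \Gamma\tilde{S}_s(\omega))^2}\,d\omega
&= \frac{1}{4\pi}\oint\frac{\Gamma[(1 + a^2)(z + z^{-1}) - 4a]}{r_e^2(1 - bz)^2(1 - bz^{-1})^2}\,\frac{dz}{jz} \\
&= \frac{\Gamma}{4\pi r_e^2\, j}\oint\frac{(1 + a^2)(z^2 + 1) - 4az}{(1 - bz)^2(z - b)^2}\,dz \\
&= \frac{\Gamma}{4\pi r_e^2\, j}\, 2\pi j\,\operatorname{Res}(z = b),
\end{aligned}$$

where

$$\operatorname{Res}(z = b) = \frac{d}{dz}\left[\frac{(1 + a^2)(z^2 + 1) - 4az}{(1 - bz)^2}\right]_{z = b} = \frac{4(b - a)(1 - ab)}{(1 - b^2)^3}. \tag{65}$$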
Hence, $\partial K/\partial a$ is given by

$$\frac{\partial K}{\partial a} = \frac{\Gamma(b - a)}{r_e(1 - b^2)}\left(\frac{1}{1 - ab} - \frac{2(1 - ab)}{r_e(1 - b^2)^2}\right). \tag{66}$$
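The derivative formula of Lemma 1 can be compared against a central finite difference of the closed-form exponent. The sketch below does this for $\sigma^2 = 1$, taking $P$ as the positive root of the scalar Riccati equation (assumed to coincide with (24)); the helper names and the step size are illustrative choices.

```python
import math

def K(a, snr):
    """Closed-form error exponent for sigma^2 = 1."""
    Q = snr * (1.0 - a * a)
    c = 1.0 - a * a
    P = 0.5 * (math.sqrt((c - Q) ** 2 + 4.0 * Q) - c + Q)
    Re = P + 1.0
    Re_tilde = 1.0 + a * a * P * P / (Re * Re - a * a)
    return 0.5 * math.log(Re) + 0.5 * Re_tilde / Re - 0.5

def dK_da(a, snr):
    """Partial derivative of K with respect to a from the lemma, with r_e from (61)."""
    m = 1.0 + a * a + snr * (1.0 - a * a)
    r_e = 0.5 * (math.sqrt(m * m - 4.0 * a * a) + m)
    b = a / r_e
    return (snr * (b - a) / (r_e * (1.0 - b * b))
            * (1.0 / (1.0 - a * b) - 2.0 * (1.0 - a * b) / (r_e * (1.0 - b * b) ** 2)))

def central_difference(f, x, step=1e-6):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

for a in (0.2, 0.5, 0.8):
    for snr in (0.3, 1.0, 5.0):
        numeric = central_difference(lambda t: K(t, snr), a)
        print(a, snr, dK_da(a, snr), numeric)
```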



Proof of Theorem 2 

First, the continuity of $K$ in (21) is straightforward as a function of $R_e$ and $\tilde{R}_e$ since $R_e \ge \sigma^2$,
and the continuity of $R_e$ and $\tilde{R}_e$ is also trivial as a function of $a$ and $P$ from (22) and (23).
Thus, we need only show the continuity of $P$ as a function of $a$, i.e., the nonnegativity of the
argument of the square root in (24). The argument can be rewritten as

$$[\sigma^2(1 - a^2) - Q]^2 + 4\sigma^2 Q = \sigma^4(1 - a^2)\big[(1 - a^2)(1 - \Gamma)^2 + 4\Gamma\big], \tag{67}$$

which is nonnegative if either $\Gamma = 1$ or $\frac{4\Gamma}{(1 - \Gamma)^2} \ge -\min_{a \in [0,1]}(1 - a^2) = 0$ if $\Gamma \ne 1$. Thus, $K$ is a
continuous function of $a$ ($0 \le a \le 1$) for any SNR $\Gamma > 0$.
(i) SNR > 1:

From (66), we have

$$\begin{aligned}
\frac{\partial K}{\partial a}
&= \frac{\Gamma(b - a)}{r_e(1 - b^2)}\left(\frac{1}{1 - ab} - \frac{2b(1 - ab)}{a(1 - b^2)^2}\right) \\
&= \frac{\Gamma(b - a)}{r_e(1 - b^2)}\cdot\frac{a(1 - b^2)^2 - 2b(1 - ab)^2}{a(1 - ab)(1 - b^2)^2} \\
&= \frac{\Gamma(b - a)}{r_e(1 - b^2)}\cdot\frac{a(1 + b^2)^2 - 2b(1 + a^2 b^2)}{a(1 - ab)(1 - b^2)^2} \\
&= \frac{\Gamma(b - a)\big[b\, r_e(1 + b^2)^2 - 2b(1 + a^2 b^2)\big]}{r_e\, a(1 - ab)(1 - b^2)^3} \\
&= \frac{\Gamma(b - a)\, b\, r_e^{-1}\big[(1 + \tilde{Q} + a^2)^2 - 2r_e(1 + a^4 r_e^{-2})\big]}{r_e\, a(1 - ab)(1 - b^2)^3},
\end{aligned} \tag{68}$$

where $\tilde{Q} = Q/\sigma^2 = \Gamma(1 - a^2)$ and we have used the relation $r_e(1 + b^2) = 1 + \tilde{Q} + a^2$ in the canonical
spectral factorization. (See [2] p. 242.) We also have the relation

$$r_e + \frac{a^2}{r_e} = 1 + \tilde{Q} + a^2, \tag{69}$$

which implies

$$r_e(1 + a^4 r_e^{-2}) \le 1 + \tilde{Q} + a^2, \tag{70}$$

since $0 \le a \le 1$. Hence, for the last term in the numerator of (68) we have

$$(1 + \tilde{Q} + a^2)^2 - 2r_e(1 + a^4 r_e^{-2}) \ge (1 + \tilde{Q} + a^2)^2 - 2(1 + \tilde{Q} + a^2). \tag{71}$$

The RHS of (71) is positive for $1 + \tilde{Q} + a^2 = 1 + a^2 + \Gamma(1 - a^2) > 2$, which reduces to the condition
$\Gamma > 1$. Since

$$r_e = R_e/\sigma^2 = 1 + P/\sigma^2 > 1 \quad \text{for } 0 \le a < 1, \tag{72}$$

we have

$$b - a = (r_e^{-1} - 1)\,a < 0. \tag{73}$$

Hence, $\partial K/\partial a < 0$ for $0 < a < 1$ and $\Gamma > 1$, and $K$ is monotonically decreasing as $a \uparrow 1$ for $\Gamma > 1$.

(ii) SNR < 1:

For a given $\Gamma$, denote the last term in the numerator of (68) by

$$f_\Gamma(a) = (1 + \tilde{Q} + a^2)^2 - 2r_e(1 + a^4 r_e^{-2}). \tag{74}$$

Then, we can write

$$f_\Gamma(a = 0) = (1 + \Gamma)^2 - 2(1 + \Gamma) = \Gamma^2 - 1, \tag{75}$$

since $\tilde{Q} = Q/\sigma^2 = \Gamma(1 - a^2)$ and $r_e = 1 + \Gamma$ for $a = 0$ from (61). We have $f_\Gamma(a = 0) < 0$ for $\Gamma < 1$ and
$\partial K/\partial a > 0$ from (68) since $b - a < 0$. Hence, $K$ increases as $a$ increases in the neighborhood of $a = 0$
with $K(a = 0) = D(\mathcal{N}(0, 1)\,\|\,\mathcal{N}(0, 1 + \Gamma)) > 0$ if $0 < \Gamma < 1$. However, $K \to 0$ as $a$ approaches one
since $P_M \sim \Theta(1/\sqrt{n})$ at $a = 1$. Hence, $K$ achieves a maximum at nonzero $a$ for SNR $\Gamma < 1$ since
$K$ is a continuous function of $a$, and the value of $a$ achieving the maximum is given by $f_\Gamma(a) = 0$
since $\partial K/\partial a$ is also continuous with $a$.
As SNR $\Gamma \downarrow 0$, we have

$$\tilde{Q} = \Gamma(1 - a^2) \downarrow 0 \quad \text{and} \quad r_e = 1 + P/\sigma^2 \downarrow 1. \tag{76}$$

The last term in the numerator of (68) is given by

$$(1 + \tilde{Q} + a^2)^2 - 2r_e(1 + a^4 r_e^{-2}) \to (1 + a^2)^2 - 2(1 + a^4) = -(1 - a^2)^2 < 0, \tag{77}$$

for $0 \le a < 1$ as $\Gamma \downarrow 0$. Hence, for any $\delta > 0$, there exists $\Gamma_0$ small enough such that for
all $\Gamma < \Gamma_0(\delta)$, $(1 + \tilde{Q} + a^2)^2 - 2r_e(1 + a^4 r_e^{-2}) < -(1 - a^2)^2 + \delta$. This guarantees that for
$\Gamma < \Gamma_0(\delta)$, $\partial K/\partial a > 0$ for $0 < a < \sqrt{1 - \sqrt{\delta}}$, and $a^* > \sqrt{1 - \sqrt{\delta}}$. ■
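The two regimes of Theorem 2 can be illustrated with a simple grid search over the correlation coefficient. The sketch below evaluates the closed-form exponent for $\sigma^2 = 1$ on a grid of $a$ values, taking $P$ as the positive root of the scalar Riccati equation (assumed to coincide with (24)); the grid, the SNR values, and the function names are illustrative assumptions.

```python
import numpy as np

def error_exponent(a, snr):
    """Closed-form error exponent for sigma^2 = 1; a may be a NumPy array."""
    Q = snr * (1.0 - a**2)
    c = 1.0 - a**2
    P = 0.5 * (np.sqrt((c - Q) ** 2 + 4.0 * Q) - c + Q)
    Re = P + 1.0
    Re_tilde = 1.0 + a**2 * P**2 / (Re**2 - a**2)
    return 0.5 * np.log(Re) + 0.5 * Re_tilde / Re - 0.5

a_grid = np.linspace(0.0, 0.999, 1000)

def optimal_correlation(snr):
    """Grid point of a_grid that maximizes the error exponent for the given SNR."""
    return a_grid[np.argmax(error_exponent(a_grid, snr))]

for snr in (0.01, 0.1, 0.5, 4.0):
    print(snr, optimal_correlation(snr))
```

For an SNR above one the maximizer should sit at the left end of the grid, while for an SNR below one it should move into the interior.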



Proof of Theorem 3 
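Let $s = 1 + \Gamma\tilde{S}_s(\omega)$ where $\tilde{S}_s(\omega)$ is given by (58). Then, from (57), the partial derivative of $K$
w.r.t. $\Gamma$ is given by

$$\frac{\partial K}{\partial\Gamma} = \frac{1}{2\pi}\int_0^{2\pi}\frac{\partial}{\partial s}\left(-\frac{1}{2}\log\frac{1}{s} + \frac{1}{2}\,\frac{1}{s} - \frac{1}{2}\right)\frac{\partial s}{\partial\Gamma}\,d\omega, \tag{78}$$

where

$$\frac{\partial}{\partial s}\left(-\frac{1}{2}\log\frac{1}{s} + \frac{1}{2}\,\frac{1}{s} - \frac{1}{2}\right) = \frac{1}{2}\,\frac{s - 1}{s^2} = \frac{1}{2}\,\frac{\Gamma\tilde{S}_s(\omega)}{s^2} \ge 0, \tag{79}$$

and

$$\frac{\partial s}{\partial\Gamma} = \tilde{S}_s(\omega) = \frac{1 - a^2}{1 - 2a\cos\omega + a^2} > 0, \tag{80}$$

for $0 \le a < 1$. Hence,

$$\frac{\partial K}{\partial\Gamma} > 0, \tag{81}$$

and the error exponent $K$ increases monotonically as SNR increases for a given $a$ ($0 \le a < 1$).
At high SNR, we have

$$P \sim Q, \qquad \tilde{R}_e \sim \sigma^2(1 + a^2).$$

Hence, from (21), the error exponent is given at high SNR by

$$\begin{aligned}
K &\sim \frac{1}{2}\log\frac{Q + \sigma^2}{\sigma^2} + \frac{1}{2}\,\frac{\sigma^2(1 + a^2)}{Q + \sigma^2} - \frac{1}{2} \\
&= \frac{1}{2}\log\big(1 + \Gamma(1 - a^2)\big) + \frac{1}{2}\,\frac{1 + a^2}{1 + \Gamma(1 - a^2)} - \frac{1}{2}.
\end{aligned} \tag{82}$$

Since the first term is dominant at high SNR, the theorem follows. ■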
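Both claims in this proof, monotonicity in SNR and the high-SNR approximation (82), can be checked numerically. The sketch below uses the closed-form exponent for $\sigma^2 = 1$, with $P$ taken as the positive root of the scalar Riccati equation (assumed to coincide with (24)); the SNR grid and the function names are illustrative.

```python
import numpy as np

def error_exponent(a, snr):
    """Closed-form error exponent for sigma^2 = 1; snr may be a NumPy array."""
    Q = snr * (1.0 - a**2)
    c = 1.0 - a**2
    P = 0.5 * (np.sqrt((c - Q) ** 2 + 4.0 * Q) - c + Q)
    Re = P + 1.0
    Re_tilde = 1.0 + a**2 * P**2 / (Re**2 - a**2)
    return 0.5 * np.log(Re) + 0.5 * Re_tilde / Re - 0.5

def high_snr_approximation(a, snr):
    """Right-hand side of (82), including the constant -1/2."""
    q = snr * (1.0 - a**2)
    return 0.5 * np.log(1.0 + q) + 0.5 * (1.0 + a**2) / (1.0 + q) - 0.5

a = 0.5
snr_grid = np.logspace(-2, 4, 200)       # SNR from 0.01 to 10^4
K = error_exponent(a, snr_grid)
print(np.all(np.diff(K) > 0))            # monotone increase in SNR
print(K[-1], high_snr_approximation(a, snr_grid[-1]))
```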
Proof of Theorem 4 

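Since the error exponent is given by the asymptotic Kullback-Leibler rate (3), and its representation
by innovations for the vector case is the same as (36) and (41), we need only calculate
$R_e$ and $\tilde{R}_e$ for the vector model.

The steady-state variance $R_e$ for the innovations under $H_1$ is given by the conventional result
of the state-space model

$$R_e = \mathbf{h}^T \mathbf{P}\mathbf{h} + \sigma^2, \tag{83}$$

and $\mathbf{P}$ is the unique Hermitian solution of the discrete-time Riccati equation

$$\mathbf{P} = \mathbf{A}\mathbf{P}\mathbf{A}^T + \mathbf{B}\mathbf{Q}\mathbf{B}^T - \frac{\mathbf{A}\mathbf{P}\mathbf{h}\mathbf{h}^T\mathbf{P}\mathbf{A}^T}{\mathbf{h}^T\mathbf{P}\mathbf{h} + \sigma^2}, \tag{84}$$

such that $\mathbf{A} - \mathbf{K}_p\mathbf{h}^T$ is stable (the existence of the solution is guaranteed since $\mathbf{A}$ is stable,
$\mathrm{diag}(\mathbf{Q}, \sigma^2) \ge 0$, and $S_y(\omega) > 0$ due to the additive noise. See [2] p. 277.), where

$$\mathbf{K}_p = \mathbf{A}\mathbf{P}\mathbf{h}R_e^{-1}. \tag{85}$$

For $\tilde{R}_e$, we again represent $e_i$ as a linear combination of $y_1, y_2, \cdots, y_i$, and $e_i$ is given by

$$\begin{aligned}
e_i = {} & y_i - \mathbf{h}^T\mathbf{K}_{p,i-1}\, y_{i-1} - \mathbf{h}^T(\mathbf{A} - \mathbf{K}_{p,i-1}\mathbf{h}^T)\mathbf{K}_{p,i-2}\, y_{i-2} \\
& - \mathbf{h}^T(\mathbf{A} - \mathbf{K}_{p,i-1}\mathbf{h}^T)(\mathbf{A} - \mathbf{K}_{p,i-2}\mathbf{h}^T)\mathbf{K}_{p,i-3}\, y_{i-3} \\
& - \mathbf{h}^T(\mathbf{A} - \mathbf{K}_{p,i-1}\mathbf{h}^T)(\mathbf{A} - \mathbf{K}_{p,i-2}\mathbf{h}^T)(\mathbf{A} - \mathbf{K}_{p,i-3}\mathbf{h}^T)\mathbf{K}_{p,i-4}\, y_{i-4} - \cdots,
\end{aligned} \tag{86}$$

where $\mathbf{K}_{p,i}$ is the Kalman prediction gain given by $\mathbf{K}_{p,i} = \mathbf{A}\mathbf{P}_i\mathbf{h}/(\mathbf{h}^T\mathbf{P}_i\mathbf{h} + \sigma^2)$ with the one-step
prediction error covariance matrix $\mathbf{P}_i$ [2]. Since the Kalman filter converges asymptotically to
the time-invariant recursive Wiener filter when $\mathbf{A}$ is stable, we have asymptotically

$$\begin{aligned}
e_i = {} & y_i - \mathbf{h}^T\mathbf{K}_p\, y_{i-1} - \mathbf{h}^T(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)\mathbf{K}_p\, y_{i-2} \\
& - \mathbf{h}^T(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)\mathbf{K}_p\, y_{i-3} \\
& - \mathbf{h}^T(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)\mathbf{K}_p\, y_{i-4} - \cdots,
\end{aligned} \tag{87}$$

where $\mathbf{K}_p$ is the steady-state Kalman prediction gain. Thus, the innovation sequence becomes
the output of a stable recursive filter driven by the i.i.d. sequence $\{y_i\}$ under $H_0$ as in the scalar
case, and the ergodic theorem holds for $\frac{1}{n}\sum_{i=1}^{n} e_i^2$:

$$\begin{aligned}
\tilde{R}_e &= \lim_{i \to \infty} E_0\{e_i^2\} \\
&= \sigma^2 + \sigma^2\,\mathbf{h}^T\left(\sum_{k=0}^{\infty}(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)^k\,\mathbf{K}_p\mathbf{K}_p^T\,\big[(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)^k\big]^T\right)\mathbf{h}.
\end{aligned}$$

Let $\tilde{\mathbf{P}}$ be defined as

$$\tilde{\mathbf{P}} \triangleq \sum_{k=0}^{\infty}(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)^k\,\mathbf{K}_p\mathbf{K}_p^T\,\big[(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)^k\big]^T.$$

$\tilde{\mathbf{P}}$ is finite since $\mathbf{A} - \mathbf{K}_p\mathbf{h}^T$ is stable by the property of the solution of (84), and is given by the
unique solution of the following Lyapunov equation

$$\tilde{\mathbf{P}} - (\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)\,\tilde{\mathbf{P}}\,(\mathbf{A} - \mathbf{K}_p\mathbf{h}^T)^T = \mathbf{K}_p\mathbf{K}_p^T. \tag{90}$$

(Since $\mathbf{A} - \mathbf{K}_p\mathbf{h}^T$ is stable and $\mathbf{K}_p\mathbf{K}_p^T \ge 0$, there exists a unique, Hermitian, and positive semidefinite
solution $\tilde{\mathbf{P}}$ of (90) [2].) The spectrum for the vector model is given by (34) [2].

■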
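The vector-model quantities can be computed with plain fixed-point iterations, as in the following sketch. It assumes a state-space model with state matrix $\mathbf{A}$, input matrix $\mathbf{B}$, process noise covariance $\mathbf{Q}$, and observation vector $\mathbf{h}$, iterates the Riccati equation (84) from $\mathbf{P} = 0$ and the Lyapunov equation (90) from $\tilde{\mathbf{P}} = 0$, and then evaluates the exponent in the same form as the scalar result (55). The function name, the iteration count, and the example matrices are illustrative assumptions, not values from the paper.

```python
import numpy as np

def vector_error_exponent(A, B, Q, h, sigma2, iters=5000):
    """Error exponent for the vector model from R_e (83) and R_e_tilde."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    Q = np.atleast_2d(np.asarray(Q, dtype=float))
    h = np.asarray(h, dtype=float).ravel()
    n = A.shape[0]

    # Fixed-point iteration of the Riccati equation (84), started from P = 0.
    P = np.zeros((n, n))
    for _ in range(iters):
        APh = A @ P @ h
        P = A @ P @ A.T + B @ Q @ B.T - np.outer(APh, APh) / (h @ P @ h + sigma2)

    Re = h @ P @ h + sigma2                 # Eq. (83)
    Kp = A @ P @ h / Re                     # Eq. (85)
    F = A - np.outer(Kp, h)                 # closed-loop matrix A - K_p h^T

    # Fixed-point iteration of the Lyapunov equation (90), started from 0.
    P_tilde = np.zeros((n, n))
    for _ in range(iters):
        P_tilde = F @ P_tilde @ F.T + np.outer(Kp, Kp)

    Re_tilde = sigma2 * (1.0 + h @ P_tilde @ h)
    return 0.5 * np.log(Re / sigma2) + 0.5 * Re_tilde / Re - 0.5

# Scalar special case: a = 0.5, Pi_0 = 1, sigma^2 = 1, so Q = Pi_0 (1 - a^2) = 0.75.
print(vector_error_exponent([[0.5]], [[1.0]], [[0.75]], [1.0], 1.0))

# Hypothetical second-order example in companion form.
A2 = [[0.5, 0.2], [1.0, 0.0]]
print(vector_error_exponent(A2, [[1.0], [0.0]], [[1.0]], [1.0, 0.0], 1.0))
```

Plain iteration is used here only to keep the sketch dependency-free; a dedicated Riccati or Lyapunov solver would be the usual choice for larger or poorly conditioned models.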